Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs
Graph neural networks (GNNs), as the de-facto model class for representation
learning on graphs, are built upon the multi-layer perceptron (MLP)
architecture with additional message-passing layers that allow features to flow
across nodes. While conventional wisdom commonly attributes the success of GNNs
to their advanced expressivity, we conjecture that this is not the main cause
of GNNs' superiority in node-level prediction tasks. This paper pinpoints the
major source of GNNs' performance gain to their intrinsic generalization
capability, by introducing an intermediate model class dubbed
P(ropagational)MLP, which is identical to a standard MLP during training but
adopts the GNN architecture during testing. Intriguingly, we observe that PMLPs
consistently perform on par with (or even exceed) their GNN counterparts, while
being much more efficient in training. This finding sheds new light on the
learning behavior of GNNs and offers an analytic tool for dissecting various
GNN-related research problems. As an initial step
to analyze the inherent generalizability of GNNs, we show that the essential
difference between MLP and PMLP in the infinite-width limit lies in the NTK feature
map in the post-training stage. Moreover, by examining their extrapolation
behavior, we find that although many GNNs and their PMLP counterparts cannot
extrapolate non-linear functions for extremely out-of-distribution samples,
they show greater potential for generalizing to test samples near the training
data range, a natural advantage of GNN architectures.
Comment: Accepted to ICLR 2023. Codes in https://github.com/chr26195/PML
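The train-as-MLP, test-as-GNN idea can be illustrated with a minimal numpy sketch. All names here, and the mean-aggregation propagation matrix, are illustrative assumptions rather than the paper's exact architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def mlp_forward(X, W1, W2):
    # Training-time model: a plain MLP that never sees the graph.
    return relu(X @ W1) @ W2

def pmlp_forward(X, A, W1, W2):
    # Test-time model: the *same* trained weights, but with mean-aggregation
    # message passing (row-normalized adjacency) inserted before each layer.
    P = A / A.sum(axis=1, keepdims=True)   # propagation matrix (assumption)
    H = relu(P @ X @ W1)
    return P @ H @ W2

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                          # 5 nodes, 4 features
A = np.eye(5) + np.roll(np.eye(5), 1, axis=0)        # toy graph with self-loops
W1, W2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 3))

train_out = mlp_forward(X, W1, W2)        # used while fitting W1, W2
test_out = pmlp_forward(X, A, W1, W2)     # used at inference time
```

The point of the construction is that no graph-dependent parameters exist, so any gap between `train_out`-style and `test_out`-style predictions is attributable to propagation alone.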
Advective Diffusion Transformers for Topological Generalization in Graph Learning
Graph diffusion equations are intimately related to graph neural networks
(GNNs) and have recently attracted attention as a principled framework for
analyzing GNN dynamics, formalizing their expressive power, and justifying
architectural choices. One key open question in graph learning is the
generalization capability of GNNs. A major limitation of current approaches
is their reliance on the assumption that the graph topologies in the training
and test sets come from the same distribution. In this paper, we take steps towards
understanding the generalization of GNNs by exploring how graph diffusion
equations extrapolate and generalize in the presence of varying graph
topologies. We first show deficiencies in the generalization capability of
existing models built upon local diffusion on graphs, stemming from the
exponential sensitivity to topology variation. Our subsequent analysis reveals
the promise of non-local diffusion, which advocates for feature propagation
over fully-connected latent graphs, under the assumption of a specific
data-generating condition. In addition to these findings, we propose a novel
graph encoder backbone, Advective Diffusion Transformer (ADiT), inspired by
advective graph diffusion equations whose closed-form solution comes with
theoretical guarantees of the desired generalization under topological
distribution shifts. The new model, functioning as a versatile graph
Transformer, demonstrates superior performance across a wide range of graph
learning tasks.
Comment: 39 pages
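The local vs. non-local diffusion contrast can be sketched as single Euler steps in numpy. The function names, the step size `tau`, and the similarity-softmax propagation are illustrative assumptions:

```python
import numpy as np

def local_diffusion_step(H, A, tau=0.1):
    # One Euler step of local graph diffusion: features flow only along
    # observed edges, so the dynamics are sensitive to topology changes.
    P = A / A.sum(axis=1, keepdims=True)
    return H + tau * (P @ H - H)

def nonlocal_diffusion_step(H, tau=0.1):
    # Non-local diffusion over a fully-connected latent graph: propagation
    # weights come from feature similarity, not from the input edges.
    S = H @ H.T
    P = np.exp(S - S.max(axis=1, keepdims=True))
    P = P / P.sum(axis=1, keepdims=True)   # row-softmax attention weights
    return H + tau * (P @ H - H)
```

Both steps preserve constant features (each propagation matrix is row-stochastic), but only the non-local step is independent of the observed adjacency, which is the property the abstract links to robustness under topology shift.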
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion
Real-world data generation often involves complex inter-dependencies among
instances, violating the IID-data hypothesis of standard learning paradigms and
posing a challenge for uncovering the geometric structures for learning desired
instance representations. To this end, we introduce an energy constrained
diffusion model which encodes a batch of instances from a dataset into
evolutionary states that progressively incorporate other instances' information
by their interactions. The diffusion process is constrained by descent criteria
w.r.t. a principled energy function that characterizes the global consistency
of instance representations over latent structures. We provide rigorous theory
that implies closed-form optimal estimates for the pairwise diffusion strength
among arbitrary instance pairs, which gives rise to a new class of neural
encoders, dubbed DIFFormer (diffusion-based Transformers), with two
instantiations: a simple version with linear complexity for prohibitively
large instance numbers, and an advanced version for learning complex structures.
Experiments highlight the wide applicability of our model as a general-purpose
encoder backbone with superior performance in various tasks, such as node
classification on large graphs, semi-supervised image/text classification, and
spatial-temporal dynamics prediction.
Comment: Accepted by International Conference on Learning Representations (ICLR 2023)
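The linear-complexity instantiation rests on the standard kernelized-attention trick: with a positive feature map, associativity lets one avoid materializing the n×n pairwise matrix. A minimal numpy sketch, where the elu+1 feature map and all names are assumptions rather than the paper's exact diffusion update:

```python
import numpy as np

def linear_attention(Q, K, V):
    # Computes row-normalized phi(Q) phi(K)^T V without the n x n matrix:
    # phi(K)^T V is a d x d summary, so the cost is O(n d^2), not O(n^2 d).
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu+1 (assumption)
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                                  # d x d summary
    Z = Qp @ Kp.sum(axis=0, keepdims=True).T       # n x 1 normalizer
    return (Qp @ KV) / Z

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(6, 4)) for _ in range(3))
out = linear_attention(Q, K, V)                    # shape (6, 4)
```

The output is mathematically identical to first forming the full normalized attention matrix and multiplying by `V`; only the order of operations changes.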
Localized Contrastive Learning on Graphs
Contrastive learning methods based on InfoNCE loss are popular in node
representation learning tasks on graph-structured data. However, their reliance
on data augmentation and their quadratic computational complexity can lead to
inconsistency and inefficiency problems. To mitigate these limitations, in this
paper, we introduce a simple yet effective contrastive model named Localized
Graph Contrastive Learning (Local-GCL in short). Local-GCL consists of two key
designs: 1) We fabricate the positive examples for each node directly using its
first-order neighbors, which frees our method from the reliance on
carefully-designed graph augmentations; 2) To improve the efficiency of
contrastive learning on graphs, we devise a kernelized contrastive loss, which
could be approximately computed in linear time and space complexity with
respect to the graph size. We provide theoretical analysis to justify the
effectiveness and rationality of the proposed methods. Experiments on various
datasets with different scales and properties demonstrate that, in spite of its
simplicity, Local-GCL achieves highly competitive performance in self-supervised
node representation learning tasks on graphs.
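The neighbor-as-positive design can be written down directly. Below is a naive O(n²) numpy sketch of such a loss; the exact form and names are assumptions, and the paper's kernelized loss approximates a similar objective in linear time:

```python
import numpy as np

def local_gcl_loss(Z, A, tau=0.5):
    # Neighbor-positive contrastive loss: each node's positives are its
    # first-order neighbors (no augmentation needed); all other nodes act
    # as negatives. This quadratic form is for illustration only.
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    S = np.exp(Z @ Z.T / tau)          # pairwise similarity scores
    np.fill_diagonal(S, 0.0)
    pos = (S * A).sum(axis=1) / np.maximum(A.sum(axis=1), 1.0)
    return float(-np.log(pos / S.sum(axis=1) + 1e-12).mean())

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 3))
A = np.roll(np.eye(5), 1, axis=0) + np.roll(np.eye(5), -1, axis=0)  # ring graph
loss_val = local_gcl_loss(Z, A)
```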
The mechanisms of Yu Ping Feng San in tracking the cisplatin-resistance by regulating ATP-binding cassette transporter and glutathione S-transferase in lung cancer cells
Cisplatin is one of the first-line anti-cancer drugs prescribed for the treatment of solid tumors; however, chemotherapeutic drug resistance is still a major obstacle to cisplatin in treating cancers. Yu Ping Feng San (YPFS), a well-known ancient Chinese herbal combination formula consisting of Astragali Radix, Atractylodis Macrocephalae Rhizoma and Saposhnikoviae Radix, is prescribed as a herbal decoction to treat immune disorders in the clinic. To understand the fast-onset action of YPFS as an anti-cancer drug to fight against the drug resistance of cisplatin, we provided detailed analyses of intracellular cisplatin accumulation, cell viability, and the expression and activity of ATP-binding cassette transporters and glutathione S-transferases (GSTs) in YPFS-treated lung cancer cell lines. In cultured A549 or cisplatin-resistant A549/DDP cells, application of YPFS increased the accumulation of intracellular cisplatin, resulting in lower cell viability. In parallel, the activities and expression of ATP-binding cassette transporters and GSTs were down-regulated in the presence of YPFS. The expression of the p65 subunit of the NF-κB complex was reduced by treating the cultures with YPFS, leading to a high Bax/Bcl-2 ratio, i.e. an increased rate of cell death. Prim-O-glucosylcimifugin, one of the abundant ingredients in YPFS, modulated the activity of GSTs and then elevated cisplatin accumulation, resulting in increased cell apoptosis. The present results support the notion that YPFS reverses the drug resistance of cisplatin in lung cancer cells by elevating intracellular cisplatin, and the underlying mechanism may be the down-regulation of the activities and expression of ATP-binding cassette transporters and GSTs.
Handling Distribution Shifts on Graphs: An Invariance Perspective
There is increasing evidence of neural networks' sensitivity to
distribution shifts, which has brought research on out-of-distribution (OOD)
generalization into the spotlight. Nonetheless, current endeavors mostly
focus on Euclidean data, and the formulation for graph-structured data remains
unclear and under-explored, given two fundamental challenges: 1) the
inter-connection among nodes in one graph, which induces non-IID generation of
data points even under the same environment, and 2) the structural information
in the input graph, which is also informative for prediction. In this paper, we
formulate the OOD problem on graphs and develop a new invariant learning
approach, Explore-to-Extrapolate Risk Minimization (EERM), that enables
graph neural networks to leverage invariance principles for prediction. EERM
resorts to multiple context explorers (specified as graph structure editors in
our case) that are adversarially trained to maximize the variance of risks from
multiple virtual environments. Such a design enables the model to extrapolate
from a single observed environment, which is the common case for node-level
prediction. We prove the validity of our method by theoretically showing its
guarantee of a valid OOD solution and further demonstrate its power on various
real-world datasets for handling distribution shifts from artificial spurious
features, cross-domain transfers, and dynamic graph evolution.
Comment: ICLR 2022, 30 pages
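The predictor-side objective described above, mean risk plus the variance of risks across virtual environments, can be sketched in a few lines (the weight `beta` and the function name are assumptions):

```python
import numpy as np

def eerm_objective(env_risks, beta=1.0):
    # Mean risk plus variance of risks across the K virtual environments
    # generated by the adversarial graph editors. The editors are trained
    # to *maximize* the variance term; the predictor minimizes this total.
    r = np.asarray(env_risks, dtype=float)
    return float(r.mean() + beta * r.var())
```

When all environments incur the same risk, the variance term vanishes, which is exactly the invariance condition the adversarial editors try to break.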
Multisensory information facilitates the categorization of untrained stimuli
Although it has been demonstrated that multisensory information can facilitate object recognition and object memory, it remains unclear whether such a facilitation effect exists in category learning. To address this issue, comparable car images and sounds were first selected by a discrimination task in Experiment 1. Then, those selected images and sounds were utilized in a prototype category learning task in Experiments 2 and 3, in which participants were trained with auditory, visual, and audiovisual stimuli, and were tested with trained or untrained stimuli within the same categories presented alone or accompanied by a congruent or incongruent stimulus in the other modality. In Experiment 2, when low-distortion stimuli (more similar to the prototypes) were trained, there was higher accuracy for audiovisual trials than visual trials, but no significant difference between audiovisual and auditory trials. During testing, accuracy was significantly higher for congruent trials than unisensory or incongruent trials, and the congruency effect was larger for untrained high-distortion stimuli than trained low-distortion stimuli. In Experiment 3, when high-distortion stimuli (less similar to the prototypes) were trained, there was higher accuracy for audiovisual trials than visual or auditory trials, and the congruency effect was larger for trained high-distortion stimuli than untrained low-distortion stimuli during testing. These findings demonstrate that a higher degree of stimulus distortion results in a more robust multisensory effect, and that the categorization of not only trained but also untrained stimuli in one modality can be influenced by an accompanying stimulus in the other modality.
Trading Hard Negatives and True Negatives: A Debiased Contrastive Collaborative Filtering Approach
Collaborative filtering (CF), as a standard method for recommendation with
implicit feedback, tackles a semi-supervised learning problem where most
interaction data are unobserved. This nature makes existing approaches rely
heavily on mining negatives to provide correct training signals. However,
mining proper negatives is not a free lunch: it entails a tricky trade-off
between mining informative hard negatives and avoiding false ones. We
devise a new approach named Hardness-Aware Debiased Contrastive
Collaborative Filtering (HDCCF) to resolve the dilemma. It sufficiently
explores hard negatives from two aspects: 1) adaptively sharpening the
gradients of harder instances through a set-wise objective, and 2) implicitly
leveraging item/user frequency information with a new sampling strategy. To
circumvent false negatives, we develop a principled approach to improve the
reliability of negative instances and prove that the objective is an unbiased
estimation of sampling from the true negative distribution. Extensive
experiments demonstrate the superiority of the proposed model over existing CF
models and hard negative mining methods.
Comment: in IJCAI 202
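The "adaptively sharpening the gradients of harder instances" idea can be illustrated with a softmax-weighted set-wise loss. This is a simplified sketch under assumed names, not the paper's exact unbiased estimator:

```python
import numpy as np

def hardness_weighted_loss(pos_score, neg_scores, alpha=1.0):
    # Set-wise objective: negatives with higher scores (harder) receive
    # larger softmax weights, so their gradients are adaptively sharpened
    # relative to easy negatives. alpha controls the sharpening strength.
    s = np.asarray(neg_scores, dtype=float)
    w = np.exp(alpha * s)
    w = w / w.sum()                      # hardness-aware weights
    return float(np.log1p(np.sum(w * np.exp(s - pos_score))))
```

The loss decreases monotonically as the positive score grows relative to the negatives, which is the basic property any such set-wise ranking objective must satisfy.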